-
Notifications
You must be signed in to change notification settings - Fork 281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gossipsub v1.1: introduce message signing policy #294
Conversation
|
||
For signing purposes, the `signature` and `key` fields are used: | ||
- The `signature` field contains the signature. | ||
- The `key` field contains the signing key when it cannot be inlined in the source peer ID. | ||
When present, it must match the peer ID. | ||
|
||
The signature is computed over the marshalled message protobuf _excluding_ the key field. | ||
This includes any fields that are not recognized, but still included in the marshalled data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is problematic because protobuf is non-deterministic in its marshalling - the way to write this specification would be to remove mentions of protobuf and instead explicitly outline the marshalling in terms of key-value pairs and field types - it would have to be done carefully such that the result is readable by a compliant protobuf parser - it might also happen by chance that some protobuf encoders produce a valid byte stream, but with unrecognised fields in particular, the likelihood is low.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's hoist the protobuf deterministic serialisation discussion to a new issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been recurrent enough that it needs to be a first-class discussion with a solid outcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should specify canonical encoding here to remove all ambiguity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments; I'm going to take the liberty to push to your branch to make a few editorial changes, if you don't mind.
pubsub/README.md
Outdated
@@ -153,13 +153,18 @@ and | |||
|
|||
Messages can be optionally signed, and it is up to the peer whether to accept and forward | |||
unsigned messages. | |||
When the receiver expects unsigned content-based messages, and thus does not expect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mind elaborating what does content-based messages
means here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the two paragraphs that follow are better expressed as a bullet. Why don't we introduce the Lax modes here too? We anyway restrict all of this to v1.1, so we might as well introduce all options here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have called it content-addressed
, or maybe you can think of better wording, to describe messages that identify themselves without the author, seqnoi and signature approach, like in eth2. These messages do not need to be signed, as for these it depends on the application layer, which checks contents etc.
pubsub/gossipsub/gossipsub-v1.1.md
Outdated
These fields may negatively affect privacy in content-addressed messaging, | ||
and may need to be strictly enforced in author-addressed messaging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the content-addressed and author-addressed word choice might be a bit confusing. How about "anonymous" and "origin-stamped"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would work as well, maybe worth a short definition somewhere, it may pop up in other places of the pubsub spec as well. E.g. the message-ID customization may refer to the difference.
pubsub/gossipsub/gossipsub-v1.1.md
Outdated
An implementation may choose to support the legacy v1.0 "lax" signing policy, | ||
along with an explicit message authoring option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"origin-stamping" option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this paragraph below the specification of both options. Otherwise we're speaking about v1.0, then v1.1, then v1.0 support in v1.1, back to v1.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, can you include this in the changes you wanted to make? I'm including this line mostly for awareness of backwards compatibility options, but it does not need to be part of the spec otherwise.
pubsub/gossipsub/gossipsub-v1.1.md
Outdated
- Produces messages without the `signature`, `key`, `from` and `seqno` fields. | ||
The corresponding protobuf key-value pairs are absent from the marshalled message, not just empty. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separate these in two sub-bullets. "On the message producer side:" / "On the message receiver side:"
One important point to raise here is that one reason to use gossipsub and not a custom protocol is that it's a generic message transport - we can expect that the gossipsub networks will cross-pollinate - say between ipfs and eth2 for example - and it's plausible that one might wish to be a participant of both. In fact, it's one of the main advantages of using gossipsub: peer diversity is increased. The strict policies here somewhat limit that benefit - it creates, in effect, two different networks. The issue could be mitigated by making sure that implementations support different policies per topic - the way to achieve this would be to move some of the validation up to the application layer which easier can handle the per-topic nuances and instead only have a lax mode in gossipsub where:
on top of that policy, applications could then choose to restrict things further within the topic namespace they occupy - ie ETH2 might choose to not propagate messages with gossipsub signatures on eth2 topics (because we already do a mostly equivalent check / validation by verifying a payload signature). Status is developing a similar protocol on top of gossipsub that also doesn't use the gossipsub signatures (because we negotiate a different scheme which is more appropriate for the status use case - the gossipsub signatures themselves live at the wrong level of abstraction which is why their usefulness is limited, often) |
Is this one of the main benefits? That is a clear benefit of using a cross-pollinated DHT but nodes of different networks are not going to generally subscribe to (or be able to execute validations of) gossipsub topics on an alternate network. |
well, I'd say so, yes - we're paying the protobuf tax, the "optional-seqno-and-signature-tax", we namespace the topics etc in order to achieve this interoperability where multiple applications can share the infrastructure - and it's not so much about validation specifically but alternatives that might have different needs but still be related (L2, a future ETC fork, client extensions etc). If it was only ETH2, a much simpler thing to do would have been to take the research behind gossipsub, even forking it, and run it as a separate libp2p protocol ( |
I think @arnetheduck makes a valid point. If we make these signing policies configurable at the entire gossipsub stack level, it's no longer possible to participate in pubsub topics that adopt different signing policies. That is, you could never have a single node subscribed to both Filecoin (lax) and Eth2 (strict no sign) topics, which does harm interoperability and sabotages the feasibility of very desirable cross-chain applications and use cases. I'm inclined to not merge this PR and iterate on the implementation submitted in libp2p/go-libp2p-pubsub#359 to push these options down to the topic level. That way, when you join to a topic, you specify the policy that applies to that topic. @protolambda I know you are probably backlogged. Maybe someone in the libp2p or IPFS teams could take this on. |
@raulk pushing these options down to a topic level would be great, sounds like a good solution. And yes, quite backlogged with work. I think it is fair to add a notice in this PR that the options can be implemented on a per-topic basis. Then expand on that once the go-libp2p implementation has decided on the best approach. |
@protolambda makes sense. I'll rejig the text, and add an implementation margin note that we can remove once this is pushed down to the topic level in go-libp2p. |
While working on this, we also noticed that this breaks the default message id generation, which relies on sequence number and peer id, is somewhat broken when the fields are not included - ie migrating fully to a content-based hash method when the anonymous mode is enabled is necessary. |
@arnetheduck yes, good point. And so message ID should be per-topic as well (it's kind of already, if you switch based on the topic info within the message). Will update PR now to incorporate formatting + per-topic behavior recommendation. |
@raulk update:
Again, feel free to make editorial choices, squash/rename commits, or any feedback. I would like to keep this PR moving, so we can continue with ethereum/consensus-specs#2060 before the next Eth2 specs release. |
Co-authored-by: Jacek Sieka <[email protected]>
Apologies. Will get to this later today. |
In the default origin-stamped messaging, the fields need to be strictly enforced: | ||
the `seqno` and `from` fields form the `message_id`, and should be verified to avoid `message_id` collisions. | ||
|
||
In content-stamped messaging, the fields may negatively affect privacy: | ||
revealing the relationship between `data` and `from`/`seqno`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the goal of this is. This might be better moved to after "Signature Policy Options", and put under a "Usage indications/notes" section.
I would reword to something like this:
- Origin-stamped messaging: choose
StrictSign
, and use in combination with the defaultmessage_id_fn
. - Content-addressed messaging: enable
StrictNoSign
, with a custom hash-basedmessage_id_fn
. Any signatures would now be part of the payload, and would need to be validated through a custom message validator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was to give some context/intuition why the fields are handled like they are. It could be worded directly as message id approach -> signing policy
directions, like you suggest. Either option works for us.
I'm going to commit my suggestions, and I'll do another pass tomorrow. Also requesting @vyzo's review here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this is a very welcome addition.
However, I think that the signature policy should be part of the pubsub spec and not the gossipsub spec.
There is nothing gossipsub specific about signature policies, and the policy applies to all routing algorithms as signing is handled outside the router -- at least in the reference implementation.
|
||
For signing purposes, the `signature` and `key` fields are used: | ||
- The `signature` field contains the signature. | ||
- The `key` field contains the signing key when it cannot be inlined in the source peer ID. | ||
When present, it must match the peer ID. | ||
|
||
The signature is computed over the marshalled message protobuf _excluding_ the key field. | ||
This includes any fields that are not recognized, but still included in the marshalled data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should specify canonical encoding here to remove all ambiguity.
@@ -134,6 +136,50 @@ message PeerInfo { | |||
} | |||
``` | |||
|
|||
### Signature Policy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that this should be in the gossipsub spec -- it is a generic pubsub policy that applies to all routing algorithms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, will move it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was looking for a place to version it, but since it's not breaking anything necessarily, and applies to pubsub, moving it to the pubsub doc sounds good.
Use gossip signing policy in p2p spec, see libp2p/specs#294
See libp2p/go-libp2p-pubsub#359 for motivation and related discussion.
In short: Eth2 uses content-addressing and does not use these fields. If not content-addressed, signature data should be verified more strictly.
Changes:
Out of scope:
See ethereum/consensus-specs#1981 for Eth2 progress on this work, and clients implementing this functionality in gossipsub implementations.
PS. This is my first contribution directly to the libp2p specs, let me know if you have feedback to details such as formatting, I am happy to make changes.
cc @raulk, implemented the spec changes, review/edits welcome.